Getting the Best from Uncertain Data

Authors

  • Ilaria Bartolini
  • Paolo Ciaccia
  • Marco Patella

Abstract

The skyline of a relation is the set of tuples that are not dominated by any other tuple in the same relation, where tuple u dominates tuple v if u is no worse than v on all the attributes of interest and strictly better on at least one attribute. Previous attempts to extend skyline queries to probabilistic databases have proposed either a weaker form of domination, which is unsuitable to univocally define the skyline, or a definition that implies algorithms with exponential complexity. In this paper we demonstrate how, given a semantics for linearly ranking probabilistic tuples, the skyline of a probabilistic relation can be univocally defined. Our approach preserves the three fundamental properties of skyline: 1) it equals the union of all top-1 results of monotone scoring functions, 2) it requires no additional parameter to be specified, and 3) it is insensitive to actual attribute scales. We also detail efficient sequential and index-based algorithms.
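The domination rule stated at the start of the abstract can be illustrated for the ordinary (non-probabilistic) case only. The following is a minimal sketch, not the authors' algorithm: it assumes that lower values are better on every attribute, and the names `dominates`, `skyline`, and the `hotels` data are hypothetical, chosen for illustration. It uses a naive quadratic scan rather than the sequential or index-based algorithms the paper describes.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if u dominates v.

    Assumption (not from the paper): lower is better on every attribute.
    u dominates v when u is no worse on all attributes and strictly
    better on at least one.
    """
    no_worse = all(a <= b for a, b in zip(u, v))
    strictly_better = any(a < b for a, b in zip(u, v))
    return no_worse and strictly_better


def skyline(relation: List[Sequence[float]]) -> List[Sequence[float]]:
    """Naive O(n^2) skyline: keep tuples not dominated by any other tuple."""
    return [v for v in relation if not any(dominates(u, v) for u in relation)]


# Hypothetical data: (price, distance to the beach), both to be minimized.
hotels = [(50, 3.0), (60, 1.0), (70, 2.0), (55, 3.5)]
result = skyline(hotels)
print(result)
```

Here (70, 2.0) is dominated by (60, 1.0) and (55, 3.5) is dominated by (50, 3.0), so only the first two tuples remain. Extending this to probabilistic tuples requires the ranking semantics discussed in the paper, which this sketch does not attempt.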

Similar resources

Getting the Best from Uncertain Data: the Correlated Case

In this extended abstract we apply the notion of skyline to the case of probabilistic relations including correlation among tuples. In particular, we consider the relevant case of the x-relation model, consisting of a set of generation rules specifying the mutual exclusion of tuples. We show how our definitions apply to different ranking semantics and analyze the time complexity for the resolut...

The challenge of getting a high quality of RNA from oocyte for gene expression study

The extraction of intact RNA from oocytes is quite challenging and time-consuming. A standard protocol using a commercial RNA extraction kit yields a low quantity of RNA from oocytes. In the past, several attempts to obtain RNA for gene expression studies resulted in a few different modified methods. Extraction of high-quality RNA from oocytes is important before further downstream analyses such as...

Sensitivity Analysis of Spatial Sampling Designs for Optimal Prediction

In spatial statistics, the analyzed data are correlated, and this correlation is due to their locations in the studied region. Such correlation, which depends on the distance between observations, is called spatial correlation. Usually in spatial data analysis, the prediction of an uncertain quantity at arbitrary locations of the area is considered according to attained observations ...

A Bayesian mixture model for classification of certain and uncertain data

There are different types of classification methods for classifying certain data. The values of the variables are not always certain; they may instead belong to an interval, in which case the data are called uncertain. In recent years, under the assumption that the distribution of the uncertain data is normal, several estimators of the mean and variance of this distribution have been proposed. In this paper, we co...

Robust Economic-Statistical Design of Acceptance Control Chart

Acceptance control charts (ACC), as an effective tool for monitoring highly capable processes, establish control limits based on specification limits when the fluctuation of the process mean is permitted or inevitable. For designing these charts by minimizing economic costs subject to statistical constraints, an economic-statistical model is developed in this paper. However, the parameters of s...

Clustering of Uncertain Data Objects using Improved K-means Algorithm

Recently, data mining over uncertain data has attracted increasing attention. Uncertainty arises in information because of inaccurate measurement of results, such as scientific results or data gathered from sensor networks measuring temperature, humidity, pressure, and so on. Such sources can introduce uncertainty into the data. The main task is to handle...

Publication date: 2011